perf: valTag dispatch for O(1) materializer type routing #682

Open
He-Pin wants to merge 1 commit into databricks:master from He-Pin:perf/valtag-dispatch

He-Pin commented Apr 5, 2026

Motivation

The materializer uses isInstanceOf / pattern matching to determine the runtime type of each Val before converting it to ujson.Value. This creates a chain of type checks for every value materialized. For large JSON outputs (like realistic2 with thousands of values), this overhead accumulates.

Key Design Decision

Add a valTag: Byte field to Val that encodes the runtime type as a numeric constant, enabling O(1) dispatch via a lookup table or switch instead of sequential isInstanceOf checks. Each Val subtype sets its tag at construction time.
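
The idea can be illustrated with a minimal sketch. It is not the PR's code: `ValTagSketch`, the simplified `Val` subclasses, and `render` are hypothetical stand-ins, the tags are `Int` rather than `Byte` to keep the constants simple, and the numeric values are assumed to follow the order in which the tags are listed in this PR. String escaping is deliberately omitted.

```scala
import scala.annotation.switch

// Hypothetical, simplified stand-in for the Val hierarchy (not sjsonnet's real classes).
object ValTagSketch {
  // `final val` without a type annotation keeps these as compile-time constants,
  // which is what allows them to be used as cases of a @switch match.
  final val TAG_STR = 0
  final val TAG_NUM = 1
  final val TAG_TRUE = 2
  final val TAG_FALSE = 3
  final val TAG_NULL = 4
  final val TAG_ARR = 5

  sealed abstract class Val { def valTag: Int }
  final case class Str(value: String) extends Val { def valTag: Int = TAG_STR }
  final case class Num(value: Double) extends Val { def valTag: Int = TAG_NUM }
  case object True extends Val { def valTag: Int = TAG_TRUE }
  case object False extends Val { def valTag: Int = TAG_FALSE }
  case object Null extends Val { def valTag: Int = TAG_NULL }
  final case class Arr(items: Vector[Val]) extends Val { def valTag: Int = TAG_ARR }

  // One integer switch on the tag replaces a chain of isInstanceOf checks.
  // The casts are safe only because each subclass reports its own tag.
  def render(v: Val): String = (v.valTag: @switch) match {
    case TAG_STR   => "\"" + v.asInstanceOf[Str].value + "\""
    case TAG_NUM   => v.asInstanceOf[Num].value.toString
    case TAG_TRUE  => "true"
    case TAG_FALSE => "false"
    case TAG_NULL  => "null"
    case TAG_ARR   => v.asInstanceOf[Arr].items.map(render).mkString("[", ",", "]")
    case _         => throw new IllegalArgumentException("unknown tag " + v.valTag)
  }
}
```

Keeping the tag values contiguous matters here: a dense range of constant cases is what lets the compiler emit a jump table instead of a sequence of comparisons.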

Modification

  • Val.scala: Added valTag field to Val base class, with constants for each subtype (TAG_STR, TAG_NUM, TAG_OBJ, etc.)
  • Materializer.scala: Changed type dispatch from pattern matching to tag-based switch
  • Test: Added valtag_dispatch.jsonnet covering all value types through materialization

Benchmark Results

JMH (JVM, 3 iterations)

| Benchmark | Master (ms/op) | This PR (ms/op) | Change |
| --- | --- | --- | --- |
| bench.02 | 50.427 ± 38.906 | 45.040 ± 1.141 | -10.7% |
| comparison2 | 85.854 ± 188.657 | 85.570 ± 42.874 | neutral |
| realistic2 | 73.458 ± 66.747 | 68.697 ± 4.175 | -6.5% |

Hyperfine (Scala Native, 10 runs, vs master)

| Benchmark | Master (ms) | This PR (ms) | Speedup |
| --- | --- | --- | --- |
| bench.02 | 75 ± 2 | 75 ± 2 | neutral |
| comparison2 | 184 ± 3 | 184 ± 3 | neutral |
| realistic2 | 303 ± 4 | 306 ± 4 | neutral |

Analysis

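  • JVM: -10.7% on bench.02 and -6.5% on realistic2. The likely cause is that HotSpot's type-check dispatch is slower than a tag-based switch for a type with many subtypes. The master runs have wide error intervals (± 38.906 and ± 66.747 ms/op over 3 iterations), so these deltas are indicative rather than conclusive.
  • Scala Native: neutral. LLVM devirtualization already handles the type dispatch efficiently at compile time.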
  • The optimization primarily benefits JVM workloads with heavy materialization
  • No regressions on any benchmark

References

  • Upstream exploration: he-pin/sjsonnet jit branch commit 30b7495b
  • Pattern: similar to tagged-union dispatch used in JDK and Rust implementations

Result

Consistent JVM improvement for materialization-heavy workloads. Neutral on Scala Native.

Add a valTag: Byte abstract method to Val with TAG constants (0-7) for
each concrete subclass, enabling JVM tableswitch O(1) dispatch in the
materializer instead of linear pattern matching.

Changes:
- Val.scala: Add valTag abstract method and TAG_STR/NUM/TRUE/FALSE/NULL/
  ARR/OBJ/FUNC constants (0-7 contiguous range)
- Materializer.scala: Replace pattern-match in materializeRecursiveChild
  with @switch tableswitch dispatch on valTag. Hoist xs.length out of
  materializeRecursiveArr while-loop.
- CustomValTests.scala: Add valTag=-1 to ImportantString (custom Val)

JMH improvements: reverse -2.9%, base64DecodeBytes -4.6%, comparison2 -1.9%,
base64 -2.1%. No regressions outside noise range.

Upstream: he-pin/sjsonnet jit branch commits 30b7495, 9ddb1a5
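
Two details of the commit message above can be sketched in isolation: a custom `Val` tagged `-1` falling outside the built-in tag range, and the array length being read once before the `while` loop. Everything here is a hypothetical illustration. `FallbackSketch`, `describe`, and `describeAll` are invented names, `Int` tags stand in for `Byte`, and routing `-1` through the default branch to an ordinary pattern match is an assumption about how such a value could be handled, not a description of the PR's materializer.

```scala
// Hypothetical sketch; only the name ImportantString and the tag value -1 come from the PR text.
object FallbackSketch {
  final val TAG_NUM = 1

  abstract class Val { def valTag: Int }
  final class Num(val value: Double) extends Val { def valTag: Int = TAG_NUM }
  // A user-defined Val that is not part of the built-in 0-7 tag range.
  final class ImportantString(val value: String) extends Val { def valTag: Int = -1 }

  def describe(v: Val): String = v.valTag match {
    case TAG_NUM => "num:" + v.asInstanceOf[Num].value
    // Assumed fallback: unknown tags take the slow path and are matched by type.
    case _ =>
      v match {
        case s: ImportantString => "custom:" + s.value
        case other => throw new IllegalArgumentException("unsupported Val: " + other)
      }
  }

  def describeAll(xs: Array[Val]): Array[String] = {
    val len = xs.length // read once, outside the loop
    val out = new Array[String](len)
    var i = 0
    while (i < len) {
      out(i) = describe(xs(i))
      i += 1
    }
    out
  }
}
```

The fast path stays a plain integer comparison for built-in values, while custom subclasses pay for type matching only when they actually occur.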
He-Pin marked this pull request as ready for review on April 5, 2026, 08:34